Reinforcement Learning
- 2nd year, second semester
- Attendance not mandatory
- 6 CFU
- 48 hours
- English
- Trieste
- Elective
- Standard teaching
- Oral exam
- SSD INF/01
In this course you will learn fundamental concepts and algorithms in
decision-making theory and Reinforcement Learning.
Knowledge and understanding: basic and advanced topics in Markov
Decision Processes, Dynamic Programming, Partially Observable Markov
Decision Processes, Model-free Reinforcement Learning, Temporal
Difference Methods, Multi-agent and Adversarial Reinforcement Learning.
Applying knowledge and understanding: learning how to use
Reinforcement Algorithms to identify optimal and approximately optimal
strategies for decision-making in simple and complex environments.
Making judgments: understanding how to translate a verbal description of a decision-making or control problem into a mathematical reinforcement learning setting.
Communication skills: being able to explain the basic ideas and
communicate the results to experts and non-experts.
Learning skills: being capable of exploring the literature, finding alternative
approaches, and combining them to solve complex problems.
Basic knowledge of Python and scientific Python. Basic knowledge of
probability, statistics, Bayesian inference and Markov processes.
1. Markov Decision Processes
2. Bellman's optimality equation and Dynamic Programming
3. Partially Observable Markov Decision Processes
4. Model-free Reinforcement Learning
5. Temporal Difference Methods
6. Reinforcement Learning with function approximation
7. Critic-only, Actor-only, and Actor-Critic architectures for Reinforcement Learning
8. Multi-agent Reinforcement Learning and Game Theory
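To give a flavour of topics 4 and 5, the sketch below shows tabular Q-learning, a model-free temporal-difference method of the kind coded in the hands-on sessions. It is purely illustrative: the chain environment, the hyperparameters, and all names are invented for this example and are not taken from the course material.

```python
import random

# Illustrative only: a tiny chain MDP and a tabular Q-learning sketch.
# Environment, hyperparameters, and names are invented for this example.

N_STATES = 5          # states 0..4; reaching state 4 ends the episode
ACTIONS = [-1, +1]    # move left or right along the chain
ALPHA, GAMMA, EPS = 0.1, 0.9, 0.3

def step(state, action):
    """One transition: reward 1 only on reaching the terminal state."""
    nxt = min(max(state + action, 0), N_STATES - 1)
    done = nxt == N_STATES - 1
    return nxt, (1.0 if done else 0.0), done

random.seed(0)
Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}

for _ in range(500):
    s, done = 0, False
    while not done:
        # epsilon-greedy action selection
        if random.random() < EPS:
            a = random.choice(ACTIONS)
        else:
            a = max(ACTIONS, key=lambda act: Q[(s, act)])
        s2, r, done = step(s, a)
        # temporal-difference (Q-learning) update toward the bootstrapped target
        best_next = 0.0 if done else max(Q[(s2, b)] for b in ACTIONS)
        Q[(s, a)] += ALPHA * (r + GAMMA * best_next - Q[(s, a)])
        s = s2

# The learned greedy policy should move right in every non-terminal state.
policy = {s: max(ACTIONS, key=lambda act: Q[(s, act)]) for s in range(N_STATES - 1)}
print(policy)
```

Note that topic 2 (Bellman's equation and Dynamic Programming) would recover the same optimal policy directly from the known transition model via value iteration; Q-learning instead learns it from sampled transitions only.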
Recommended
1. R.S. Sutton and A.G. Barto, Reinforcement Learning: An Introduction, 2nd ed.,
Cambridge, MA: MIT Press, 2018.
Other textbooks:
2. C. Szepesvári, Algorithms for Reinforcement Learning, Morgan & Claypool, 2010.
3. D. Bertsekas, Reinforcement Learning and Optimal Control, Athena Scientific, 2019.
Frontal lectures and hands-on sessions, both individual and in groups.
The balance will be roughly 60% frontal lectures and 40% hands-on
sessions. Ideally, each lecture will combine a part of frontal teaching
with a part of hands-on training (coding algorithms).
The exam will consist of two parts:
1. a group project, carried out in groups of 2 to 4 students. Each group will be
assigned one or more tasks, typically finding (approximately) optimal strategies by
means of Reinforcement Learning; each group will have to write a short
report, provide commented code, and give a brief presentation explaining
the work done.
2. a short individual presentation of a topic not presented in the course,
and studied autonomously by the student.
During the presentations, a few questions will be asked to assess each
student's individual contribution and preparation on the topics of the course.